Skip to content

Conversation

volsa
Copy link
Member

@volsa volsa commented Jul 17, 2025

This PR is the first of two introducing polymorphism to the compiler. This PR focuses on polymorphism for classes and function blocks whereas the second will focus on introducing interfaces and validations. For reviewing this monster of a PR, best to start reading either the tests or lowering/vtable.rs and lowering/polymorphism.rs before exploring other files.

In short, polymorphism enables dynamic dispatch allowing for user code such as the following to execute

FUNCTION_BLOCK A
    METHOD getName: STRING
        getName := 'A';
    END_METHOD

    METHOD printName
        printf('name = %s$N', ADR(getName()));
    END_METHOD
END_FUNCTION_BLOCK

FUNCTION_BLOCK B EXTENDS A
    METHOD getName: STRING
        getName := 'B';
    END_METHOD

    METHOD persistName
        // Persist name to some file
    END_METHOD
END_FUNCTION_BLOCK

FUNCTION main
    VAR
        result: STRING;
        instanceA: A;
        instanceB: B;

        refInstanceA: POINTER TO A;
    END_VAR

    refInstanceA := ADR(instanceA);
    result := refInstanceA^.getName();   // Calls `A::getName` yielding "name = A"
    refInstanceA^.printName();           // Calls `A::printName` which calls `A::getName` yielding "name = A"

    refInstanceA := ADR(instanceB);
    result := refInstanceA^.getName();   // Calls `B::getName` yielding "name = B"
    refInstanceA^.printName();           // Calls `A::printName` which calls `B::getName` yielding "name = B"
END_FUNCTION

This is achieved by calling methods indirectly through a virtual table. Any class or function block will generate a __vtable_<POU_NAME> struct which will be part of the POU of the following form

TYPE __vtable_<POU_NAME>:
    STRUCT
        // function pointers to inherited, overridden or POU unique methods
    END_STRUCT
END_TYPE

Taking the previous example, the compiler would generate

TYPE __vtable_A:
    STRUCT
        getName: __FPOINTER A.getName := ADR(A.getName);
        printName: __FPOINTER A.printName := ADR(A.printName);
    END_STRUCT
END_TYPE

TYPE __vtable_B:
    STRUCT
        getName: __FPOINTER B.getName := ADR(B.getName);                // Overridden
        printName: __FPOINTER A.printName := ADR(A.printName);          // Inherited
        persistName: __FPOINTER B.persistName := ADR(B.persistName);    // Unique to B
    END_STRUCT
END_TYPE

Calling methods then invoke the virtual table to access the correct method, deciding which method needs to be called at runtime. This happens by transforming the method calls such as ADR(getName()) to __vtable_A#(THIS^.__vtable^).getName^(THIS^)). For child POUs this means upcasting their virtual table to their parent one.

The vtable member fields thereby will be initialized in the corresponding __init function, pointing to some global variable instance. That is, every POU has a global variable instance of form

VAR_GLOBAL
    __vtable_A_instance: __vtable_A;
    __vtable_B_instance: __vtable_B;
END_VAR

Finally, the final form of the previous example as transformed by the compiler would look as follows:

TYPE __vtable_A:
    STRUCT
        getName: __FPOINTER A.getName := ADR(A.getName);
        printName: __FPOINTER B.printName := ADR(A.printName);
    END_STRUCT
END_TYPE

TYPE __vtable_B:
    STRUCT
        getName: __FPOINTER B.getName := ADR(B.getName);                // Overridden
        printName: __FPOINTER A.printName := ADR(A.printName);          // Inherited
        persistName: __FPOINTER B.persistName := ADR(B.persistName);    // Unique to B
    END_STRUCT
END_TYPE

VAR_GLOBAL
    __vtable_A_instance: __vtable_A;
    __vtable_B_instance: __vtable_B;
END_VAR

FUNCTION_BLOCK A
    VAR
        // Note: Only the upper-most base POU, i.e. non-extended POU, will have a `__vtable` entry. Every child POU will access and override it internally.
        __vtable: POINTER TO __VOID := ADR(__vtable_A_instance);
    END_VAR

    METHOD getName: STRING
        getName := 'A';
    END_METHOD

    METHOD printName
        printf('name = %s$N', ADR(__vtable_A#(THIS^.__vtable^).getName^(THIS^)));
    END_METHOD
END_FUNCTION_BLOCK

FUNCTION_BLOCK B EXTENDS A
    VAR
        // Happens internally, accessing the super virtual table overridding it:
        // __A.__vtable := ADR(__vtable_B_instance);
    END_VAR

    METHOD getName: STRING
        getName := 'B';
    END_METHOD

    METHOD persistName
        // Persist name to some file
    END_METHOD
END_FUNCTION_BLOCK

FUNCTION main
    VAR
        result: STRING;
        instanceA: A;
        instanceB: B;

        refInstanceA: POINTER TO A;
    END_VAR

    refInstanceA := ADR(instanceA);
    result := __vtable_A#(refInstanceA^.__vtable^).getName^(refInstanceA^);
    __vtable_A#(refInstanceA^.__vtable^).printName^(refInstanceA^);

    refInstanceA := ADR(instanceB);
    result := __vtable_A#(refInstanceA^.__vtable^).getName^(refInstanceA^);
    __vtable_A#(refInstanceA^.__vtable^).printName^(refInstanceA^);
END_FUNCTION

@volsa volsa mentioned this pull request Jul 17, 2025
9 tasks
@volsa volsa force-pushed the vosa/polymorphism branch 2 times, most recently from f7b7c11 to 04bc60d Compare July 29, 2025 15:44
volsa added 14 commits August 1, 2025 13:15
`POINTER TO <function>` are now indexed in the codegen
Previously when dealing with function pointer calls such as `fnPtr^()`
the resolver would only annotate the operator and any arguments. Some
codegen parts require an annotation on the whole call statement however
(which makes sense), as such this commit fixes the described issue.
Visuallized, assuming `fnPtr` points to a function returning a DINT:
```
fnPtr^();
^^^^^^^^ -> StatementAnnotation::Value { resulting_type: "DINT" }
```
@volsa volsa force-pushed the vosa/polymorphism branch from 04bc60d to e79c95e Compare August 14, 2025 14:30
volsa and others added 7 commits August 26, 2025 09:20
Function pointers can now point to the body of a function block, allowing
for code execution such as
```
FUNCTION_BLOCK A
    VAR
        localState: DINT := 5;
    END_VAR

    METHOD foo
        // ...
    END_METHOD

    printf('localState = %d$N', localState);
END_FUNCTION_BLOCK

FUNCTION main
    VAR
        instanceA: A;
        fnBodyPtr: FNPTR A := ADR(A);
    END_VAR

    fnBodyPtr^(instanceA); // prints "localState = 5"
END_FUNCTION
```
Also inject `__vtable` member variable into "root" POUs
Methods calls which must make use of the virtual table are now
desugared. For example a method call within a method such as `foo(1, 2)`
will now be desugared into `__vtable_{FB_NAME}#(THIS^.__vtable^).foo^(THIS^, 1, 2)`.
Similarly, a variable like `refInstance: POINTER TO FbA` making a method call
such as `refInstance^.foo(1, 2)` will be lowered into some similar form, except
not making use of THIS, rather of the operator name (excluding the method name), i.e.
`__vtable_FbA#(refInstance^.__vtable^).foo(refInstance^, 1, 2)`.
The `__vtable` member field is now initialized with a right hand side of
`ADR(__vtable_<POU_NAME>_instance)` which represents the global instance
variable of a virtual table that is initialized with its function pointers.
@volsa volsa force-pushed the vosa/polymorphism branch from e79c95e to 27a06a5 Compare August 26, 2025 11:22
The `__body` function pointer now points at the function blocks body
method in the virtual table struct. As a result, code execution like
the following is now possible
```
VAR
    instanceA: A;
    instanceB: B; // Child of A

    refInstanceA: POINTER TO A;
END_VAR

refInstanceA := ADR(A);
refInstanceA^(); // Calls body method of A

refInstanceA := ADR(B);
refInstanceA^(); // Calls body method of B
```
@volsa volsa force-pushed the vosa/polymorphism branch from 956a3eb to 213376f Compare August 27, 2025 09:07
@volsa volsa force-pushed the vosa/polymorphism branch 3 times, most recently from b3b3dec to 521ed19 Compare September 1, 2025 15:32
@volsa volsa force-pushed the vosa/polymorphism branch from 521ed19 to 68cdbf7 Compare September 1, 2025 15:55
@volsa volsa requested a review from Copilot September 2, 2025 09:05
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces polymorphism support for structured text by implementing virtual tables and method dispatch. The implementation enables dynamic method calls on function blocks and classes through pointer-based references.

Key changes:

  • Virtual table infrastructure to support polymorphic method calls
  • Method call lowering to use indirect function calls through virtual tables
  • Test cases for various polymorphism scenarios including inheritance chains, method overriding, and reference types

Reviewed Changes

Copilot reviewed 103 out of 107 changed files in this pull request and generated no comments.

Show a summary per file
File Description
tests/tests.rs Removes classes module reference to fix test organization
tests/lit/single/polymorphism/*.st Comprehensive test suite covering polymorphic scenarios (inheritance, method calls, references)
tests/lit/single/pointer/value_behind_function_block_pointer_is_assigned_to_correctly.st Test for function block pointer assignment behavior
tests/lit/single/oop/*.st Minor formatting improvements for existing OOP tests
tests/lit/multi/extern_C_fb_init/foo.c Updates external C structure to include vtable pointer
tests/lit/correctness/pointers/references/*.st Reference handling tests moved to lit framework
tests/lit/correctness/functions/*.st Function block behavior tests moved to lit framework
tests/lit/correctness/classes/class_reference_in_pou.st Class reference test moved to lit framework
tests/lit/correctness/lit.local.cfg Configuration for lit test framework
tests/integration/snapshots/*.snap Updated snapshots reflecting vtable struct changes
tests/correctness/*.rs Test cases migrated from unit tests to lit framework
src/lowering/vtable.rs New module implementing virtual table generation for polymorphism
src/validation/*.rs Updates to support polymorphic type checking and inheritance validation
src/typesystem.rs Type system extensions for method and function block identification
src/resolver.rs Updates to method resolution for inheritance chains

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@volsa volsa marked this pull request as ready for review September 2, 2025 09:28
%foo7 = getelementptr inbounds %__vtable_A, %__vtable_A* %cast6, i32 0, i32 1
%4 = load i16 (%A*, i32)*, i16 (%A*, i32)** %foo7, align 8
%deref8 = load %A*, %A** %refInstanceA, align 8
%fnptr_call9 = call i16 %4(%A* %deref8, i32 10)
Copy link
Member Author

@volsa volsa Sep 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While writing these tests, I realized that the calls look somewhat "wrong". Here we call call i16 %4(%A* %deref8, i32 10) but the signature in the virtual table of B expects i16 (%B*, i32)*, that is we pass an object of type A to a function pointer that would expect an object of B. Technically wrong, but not really an issue because up- and downcasting for child POUs works. A "much cleaner" version of this however would be consistent function signatures, i.e. inherit the types of the base classes and explicitly type cast at the start of a method. For example

%__vtable_A = type { void (%A*)*, i16 (%A*, i32)* }
-%__vtable_B = type { void (%B*)*, i16 (%B*, i32)* }
+%__vtable_B = type { void (%A*)*, i16 (%A*, i32)* }

and

-define i16 @B__foo(%B* %0, i32 %1) {
+define i16 @B__foo(%A* %0, i32 %1) {
+%this = bitcast %A* %0 to %B*

Not a blocker, just wanted to mention this as a general fyi.

@volsa volsa requested review from ghaith, riederm and mhasel September 2, 2025 11:35
@volsa volsa changed the title feat: Polymorphism feat: Polymorphism (Classes and Function Blocks) Sep 2, 2025
@@ -119,6 +119,7 @@ fn multiple_files_in_different_locations_with_debug_info() {
}

#[test]
#[ignore = "TODO: Works as a lit test but not in here because `compile_with_root` does not register any participants"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't reach the lit test yet, but lit can test IR, should we give that a shot for these tests? Start testing codegen using lit?

},
]
"#);
let statements_str = statements.iter().map(|statement| statement.as_string()).collect::<Vec<_>>();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this

//! would generate the following virtual table structures:
//!
//! ```text
//! TYPE __vtable_A:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not have vtable_b contain a vtable_a struct? I thought thatt was the initial plan since it can be transparent?
I'm guessing it's easier to just blindly look for the foo member?
I'm just wondering how this works with casting but i also don't think it would be an issue, we could derefernce the vtable_b as vtable_a and we get the first 2 members.

Copy link
Member Author

@volsa volsa Sep 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, from a desugaring POV it's much easier to just blindly call a method rather than find the correct (nested) vtable, then potentially access yet another vtable (and so on) before accessing the call. Hence I opted for the flattened structure. Re casting: As long as the order is correct, casting should be no issue

//! One final thing to note, while the casting of the virtual tables into concrete types is not really
//! interesting per-se, the upcasting from a child to its parent virtual table should at least be mentioned.
//! That is, as long as the virtual table definitions are compatible, upcasting can be performed without any
//! issues. Compatible here refers to the fact that the order of the member fields must be constant. More
//! specifically, the methods must be defined in "ancestral hierarchical order". To illustrate with the
//! previous examples, assume we have
//! ```text
//! TYPE __vtable_A:
//! STRUCT
//! getName: __FPOINTER TO A.getName := ADR(A.getName);
//! printName: __FPOINTER TO A.printName := ADR(A.printName);
//! END_STRUCT
//! END_TYPE
//!
//! TYPE __vtable_B:
//! STRUCT
//! getName: __FPOINTER TO B.getName := ADR(B.getName); // Overridden
//! printName: __FPOINTER TO A.printName := ADR(A.printName); // Inherited
//! persistName: __FPOINTER TO B.persistName := ADR(B.persistName); // New
//! END_STRUCT
//! ```
//!
//! We can safely cast from B to A's virtual table because the layout is compatible and it would result in
//! `persistName` to be cut off in the cast. Were that not the case, e.g. if `getName` were to be swapped with
//! `persistName` then the call to `getName` would silently result in calling `persistName`.

@@ -751,7 +751,11 @@ mod inheritance {
·
5 │ FUNCTION_BLOCK bar EXTENDS foo
│ --- see also
"###);

error[E048]: Could not resolve reference to __vtable
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably don't want to show these errors

Copy link
Collaborator

@riederm riederm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I like this commit and I think you did a great job!

As I mentioned in the past, I believe we're stressing the lowering here very much while avoiding changes in the code-gen part of the compiler. What I dislike about the lowering is, that this approach does not lower anything to a more low level representation. We stay in the same representation which has several risks - I think this is not healthy, but especially in testing your approach showed its strenghts!

I like your test coverage very much. The overall testing approach and the things you've built to make testing easier are awesome! 🤩

My biggest issue with the current approach is, that it disables all kinds of optimizations, since we treat all 'possibly virtual' calls as defacto virtual. Cpp introduces dedicated IR syntax to tell the compiler backend that this is a vtable, which allows De-virtualization optimizations.
We should at least add an open task to implement something like Static Class Hierarchy Analysis. I think the performance impact is relevant (also code size). Maybe a master-thesis topic!

Your introduction to this PR was very helpful, also the hints on how to review this was great!

After reviewing this PR it is not super clear how the dynamic dispatch calls via interfaces easily fit into this picture. Do you guys already have a plan for this? I would be interested - maybe we can have a call.

@@ -1421,6 +1462,24 @@ impl AstNode {
_ => None,
}
}

pub fn get_base(&self) -> Option<&AstNode> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we rename these base-helpers to give it more context?
get_base_reference_expr and add a documentation.

#[allow(non_snake_case)]
#[no_mangle]
pub extern "C" fn TON_TIME(timer: &mut TimerParams) {
TON(timer)
}

#[allow(non_upper_case_globals)]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder several things:

  1. how do you assure that all stdlib FBs got their vtable right? How do we make sure that we dont forget it in future changes when you add a method to an FB?
  2. should we try to implement as much as possible in ST to bypass this?
  3. could we bypass the whole thing by introducing sealed/final FBs that can not be extended? (I wonder if this offers a lot to the user?) - the side effect is, that everybody pays the cost of a dynamic dispatch for every FB all the time :-(

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. could macros help here?

let units = units
.into_iter()
.map(|AnnotatedUnit { mut unit, .. }| {
self.desugar_unit(&mut unit);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we rename the whole desugaring? I find dynamic dispatch no compiler sugar
I would also not mix the terms desugaring and lowering

entry:
%this = alloca %A*, align 8
store %A* %0, %A** %this, align 8
%__vtable = getelementptr inbounds %A, %A* %0, i32 0, i32 0
Copy link
Collaborator

@riederm riederm Sep 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this %__vtable local here because we grab all pou members onto the stack?

entry:
%this = alloca %A*, align 8
store %A* %0, %A** %this, align 8
%__vtable = getelementptr inbounds %A, %A* %0, i32 0, i32 0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also unused %__vtable local

@@ -56,16 +56,16 @@ fn ref_call_in_initializer_is_lowered_to_init_function() {
let units = units.iter().map(|unit| unit.get_unit()).collect::<Vec<_>>();
let init_foo_unit = &units[1].pous[0];

assert_debug_snapshot!(init_foo_unit, @r###"
assert_debug_snapshot!(init_foo_unit, @r#"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: The vtable is initialized via a function, this is a side-effect of generating dedicated ST types declarations for it?
Cpp generates statically initialized globals for the vtables

@mhasel
Copy link
Member

mhasel commented Sep 5, 2025

As we've already discussed in person, I'd like to see some debug-info tests to make sure internal vtable variables/fields do not show up in the debugger.
Also, as seen in a quick manual IR test it looks like we might run into the issue where we need to step multiple times on a single line for polymorphic-call related instructions in function bodies. (this I do not see as a blocker, but something that needs tracking if this suspicion can be confirmed)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants